Word Sense Induction for Novel Sense Detection

Authors

  • Jey Han Lau
  • Paul Cook
  • Diana McCarthy
  • David Newman
  • Timothy Baldwin

Abstract

We apply topic modelling to automatically induce word senses of a target word, and demonstrate that our word sense induction method can be used to automatically detect words with emergent novel senses, as well as token occurrences of those senses. We start by exploring the utility of standard topic models for word sense induction (WSI), with a pre-determined number of topics (=senses). We next demonstrate that a non-parametric formulation that learns an appropriate number of senses per word actually performs better at the WSI task. We go on to establish state-of-the-art results over two WSI datasets, and apply the proposed model to a novel sense detection task.
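
As a rough illustration of the novel sense detection step, the sketch below assumes each usage of a target word has already been assigned an induced sense label (for example, its highest-probability topic) in an older reference corpus and a newer focus corpus. It then ranks senses by how much more frequent they are in the focus corpus. The function names, the add-alpha smoothing, and the ratio-based score are illustrative assumptions, not the scoring method reported in the paper, and the labels are made-up toy data.

```python
from collections import Counter


def sense_proportions(labels, senses, alpha=1.0):
    """Smoothed relative frequency of each sense among the given labels.

    Add-alpha smoothing (an assumption of this sketch) keeps senses that
    never occur in a corpus from producing a zero denominator later on.
    """
    counts = Counter(labels)
    total = len(labels) + alpha * len(senses)
    return {s: (counts[s] + alpha) / total for s in senses}


def novelty_scores(reference_labels, focus_labels, alpha=1.0):
    """Ratio of focus-corpus to reference-corpus sense proportions.

    A score well above 1 marks a sense that is much more common in the
    focus corpus, which makes it a candidate novel sense.
    """
    senses = sorted(set(reference_labels) | set(focus_labels))
    p_ref = sense_proportions(reference_labels, senses, alpha)
    p_foc = sense_proportions(focus_labels, senses, alpha)
    return {s: p_foc[s] / p_ref[s] for s in senses}


# Hypothetical induced sense labels for one target word (toy data).
reference = ["s1"] * 8 + ["s2"] * 2               # older corpus
focus = ["s1"] * 5 + ["s2"] * 2 + ["s3"] * 3      # newer corpus

scores = novelty_scores(reference, focus)
most_novel = max(scores, key=scores.get)
print(most_novel, scores)
```

In this toy data the label `s3` occurs only in the focus corpus, so it receives the highest score. Token occurrences of the candidate novel sense are then simply the focus-corpus usages carrying that label.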

Related resources

Automatic Biomedical Term Polysemy Detection

Polysemy is the capacity for a word to have multiple meanings. Polysemy detection is a first step for Word Sense Induction (WSI), which identifies the different meanings of a term. Polysemy detection is also important for information extraction (IE) systems, and for building and enriching terminologies and ontologies. In this paper, we present a nov...

From the Culinary to the Political Meaning of "quenelle" : Using Topic Models For Identifying Novel Senses (De la quenelle culinaire à la quenelle politique : identification de changements sémantiques à l'aide des Topic Models) [in French]

In this study we explore topic modeling for the automatic detection of new senses of known words. We apply methods developed in previous work for English (Lau et al., 2012, 2014) on a recent case of new word sense induction in French, namely the appearance of the new meaning of gesture for the word « quenelle ». Our experiments illustrate the potential of this approach at learning word senses, ...

Word Sense Induction by Community Detection

Word Sense Induction (WSI) is an unsupervised approach for learning the multiple senses of a word. Graph-based approaches to WSI frequently represent word co-occurrence as a graph and use the statistical properties of the graph to identify the senses. We reinterpret graph-based WSI as community detection, a well studied problem in network science. The relations in the co-occurrence graph give r...

Word sense induction using word embeddings and community detection in complex networks

Word Sense Induction (WSI) is the ability to automatically induce word senses from corpora. The WSI task was first proposed to overcome the limitations of the manually annotated corpora that are required in word sense disambiguation systems. Even though several works have been proposed to induce word senses, existing systems are still very limited in the sense that they make use of structured, domai...

Prédiction de la polysémie pour un terme biomédical

Polysemy is the capacity for a term to have multiple meanings. Polysemy prediction is a first step for Word Sense Induction (WSI), which identifies the different meanings of a term, as well as for Information Extraction (IE) systems. In addition, polysemy detection is important for building and enriching terminologies and ontologies. In this paper, we present a novel approach to detect if ...

Publication date: 2012